State Transition

State transitions are defined in state elements in the type’s XML file. They indicate when a different state should become active. A state can apply specific formatting to a section and control whether other transitions or keyword highlighting can occur inside it. For example, in most languages keywords aren’t highlighted within strings or comments, but those sections still need their own formatting.

cpp.xml
<state name="Block comment" id="2" type="includeStart" colour="darkGreen"
allowNestedStates="false" useEscapeChar="false"
start="/*" end="*/" >
<startTypeIds>
<startType type="cpp" id="0" />
</startTypeIds>
<words></words>
</state>

The state element has a set of attributes that define how the state transition works. The startType elements define the type name and state id combinations in which the state transition can start. For example, you can’t start a comment in the middle of a quoted sequence.

| Attribute | Description | Default Value |
|---|---|---|
| name | The descriptive name of the state. | “” |
| id | A number used to identify the state to other transitions. | 0 |
| type | Controls how the state is handled. The options are: includeStart (state colour is applied before the start sequence and lasts until the end); excludeStart (state colour is applied after the start sequence and lasts until the end); includeStartWord (state colour is applied only to the start and end sequences); NoColour (state colour is not applied). | pre |
| colour | The colour to use for the state. | “” |
| allowNestedStates | Controls whether or not state transitions should be tested when in this state. | false |
| useEscapeChar | Indicates whether the start or end sequence should be checked to see if it’s escaped using the type’s defined escape character. | false |
| start | The character sequence used to trigger the state transition. | “” |
| end | The character sequence used to trigger the transition back to the previous state. The special “~line” sequence indicates the state ends at the end of the line, and the “~word” sequence indicates it ends at the end of the word. | “” |
| regEx | Regular expression used to ensure only whole words are detected. | "[a-zA-Z0-9]" |
| words | If set to sym, this is the list of words that can be included in the start sequence. The special “~word” sequence means all words that meet the regEx are valid. | |

Code

State.h
Type: Header file
Language: C++

State.cpp
Type: Code file
Language: C++

The State class stores information about state transitions. A state is used to control what formatting conditions are valid. It contains methods to detect the start and end of a state and to print the start and end.

The State() constructor creates a default state that’s used as the initial state for all languages.

The State(xmlNodePtr xmlNode) constructor first sets the State attributes to their default values and then uses the passed-in XML object to decode the attributes from the type file, along with the start and end TypeIdPairs.
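The "defaults first, then override from the file" pattern can be sketched as follows. This is only an illustration, not the code in State.cpp: a std::map stands in for the XML node so the example has no libxml2 dependency, and the names AttributeMap, StateConfig and decodeState are hypothetical. The type attribute and the start and end TypeIdPairs are left out for brevity.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for the attributes of one <state> element.
using AttributeMap = std::map<std::string, std::string>;

// Hypothetical holder for a subset of the documented attributes.
// The member initialisers mirror the defaults in the attribute table.
struct StateConfig {
    std::string name;                   // default ""
    int id = 0;                         // default 0
    std::string colour;                 // default ""
    bool allowNestedStates = false;     // default false
    bool useEscapeChar = false;         // default false
    std::string start;                  // default ""
    std::string end;                    // default ""
    std::string regEx = "[a-zA-Z0-9]";  // default word pattern
};

// Start from the defaults, then overwrite only the attributes that are present.
StateConfig decodeState(const AttributeMap& attrs) {
    StateConfig cfg;  // every member starts at its default value

    auto getString = [&attrs](const char* key, std::string& out) {
        auto it = attrs.find(key);
        if (it != attrs.end()) out = it->second;
    };
    auto getBool = [&attrs](const char* key, bool& out) {
        auto it = attrs.find(key);
        if (it != attrs.end()) out = (it->second == "true");
    };

    getString("name", cfg.name);
    getString("colour", cfg.colour);
    getString("start", cfg.start);
    getString("end", cfg.end);
    getString("regEx", cfg.regEx);
    getBool("allowNestedStates", cfg.allowNestedStates);
    getBool("useEscapeChar", cfg.useEscapeChar);

    auto idIt = attrs.find("id");
    if (idIt != attrs.end()) cfg.id = std::stoi(idIt->second);  // throws on non-numeric text

    return cfg;
}
```

With the attributes of the block comment example from cpp.xml, decodeState would yield id 2, start "/*" and end "*/", while regEx keeps its default because the element does not set it.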

The IsStart(std::string line, int pos, TypeIdPair testType, std::string escape) method compares the start sequence for the state transition against the characters in the current line at the specified position. If the start sequence matches, it loops through the list of allowed start type name and state id pairs to see if any of them match the passed-in pair. If useEscapeChar is true, it then counts how many times the passed-in escape character appears before the start sequence. The method returns true only if the start sequence matches, a matching type name and state id pair is found, and either useEscapeChar is false or there is an even number of escape characters before the start sequence. Otherwise it returns false.
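The first two checks in IsStart (sequence match, then allowed start pair) can be sketched as below. This is a simplified illustration under assumptions: TypeIdPair is modelled as a std::pair of type name and state id, StartRule and isStartSketch are hypothetical names, and the escape-character test is omitted to keep the example short.

```cpp
#include <string>
#include <utility>
#include <vector>

// Assumed shape of a TypeIdPair: type name plus state id.
using TypeIdPair = std::pair<std::string, int>;

// Hypothetical subset of a state: its start sequence and where it may begin.
struct StartRule {
    std::string start;                   // e.g. "/*"
    std::vector<TypeIdPair> startTypes;  // type/state combinations that allow this state
};

bool isStartSketch(const StartRule& rule, const std::string& line,
                   std::size_t pos, const TypeIdPair& current) {
    // Guard against an empty sequence or a position past the end of the line.
    if (rule.start.empty() || pos > line.size()) return false;

    // 1. The start sequence must appear at exactly this position.
    if (line.compare(pos, rule.start.size(), rule.start) != 0) return false;

    // 2. The currently active type/state pair must be one of the allowed ones.
    for (const TypeIdPair& allowed : rule.startTypes) {
        if (allowed == current) return true;
    }
    return false;
}
```

For the block comment state from cpp.xml, the rule would hold start "/*" and the single pair ("cpp", 0), so "/*" is recognised in the default cpp state but not inside, say, a string state with a different id.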

The IsEnd(std::string line, int pos, TypeIdPair testType, std::string escape) method compares the end sequence for the state transition against the characters in the current line at the specified position. If useEscapeChar is true, it then counts how many times the passed-in escape character appears before the end sequence. The method returns true if the end sequence matches and either useEscapeChar is false or there is an even number of escape characters before the end sequence. Otherwise it returns false.
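The even/odd escape test is easiest to see in code. The sketch below assumes that only escape characters immediately preceding the end sequence are counted, which is the usual reading but is not confirmed by the text above; countEscapesBefore and isEndSketch are hypothetical names.

```cpp
#include <string>

// Count consecutive copies of `escape` that sit immediately before `pos`.
std::size_t countEscapesBefore(const std::string& line, std::size_t pos,
                               const std::string& escape) {
    if (escape.empty() || pos > line.size()) return 0;
    std::size_t count = 0;
    while (pos >= escape.size() &&
           line.compare(pos - escape.size(), escape.size(), escape) == 0) {
        ++count;
        pos -= escape.size();
    }
    return count;
}

bool isEndSketch(const std::string& line, std::size_t pos, const std::string& end,
                 bool useEscapeChar, const std::string& escape) {
    if (end.empty() || pos > line.size()) return false;

    // The end sequence must appear at exactly this position.
    if (line.compare(pos, end.size(), end) != 0) return false;

    // Without escape handling, any match ends the state.
    if (!useEscapeChar) return true;

    // An even count means the escape characters escape each other,
    // so the end sequence itself is not escaped.
    return countEscapesBefore(line, pos, escape) % 2 == 0;
}
```

In a string state with a backslash escape, a quote preceded by one backslash does not end the state, while a quote preceded by two backslashes does, because the second backslash is itself escaped.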

The PrintStart(std::stringstream& lineStream, std::string line, int pos) method prints the start sequence of a state. If the state is includeStartWord or includeStart, a span is started and then the start sequence is printed to lineStream. If the state is excludeStart and not marked to end after a word, the span is started after the start sequence is printed. If the state is includeStartWord, it then tries to find a valid state word and closes the span. If the state is marked to end after a word, it tries to find a word in line and prints it before closing the span. The method returns the length of the start sequence plus any words printed.

The PrintEnd(std::stringstream& lineStream) method prints the end sequence of a state. If the state is includeStartWord, a span is started and then the end sequence is printed to lineStream. If the state is not NoColour, a closing span is then printed. The length of the end sequence is returned.
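How the state type decides where the span opens and closes can be sketched as follows. The span markup (an inline style attribute), the StateType enum, the SpanState struct and both function names are assumptions made for illustration. The word lookup for includeStartWord and the “~word” ending are left out, so the line and pos parameters are not needed here.

```cpp
#include <sstream>
#include <string>

// Hypothetical enum mirroring the documented type options.
enum class StateType { IncludeStart, ExcludeStart, IncludeStartWord, NoColour };

// Hypothetical subset of a state needed for printing.
struct SpanState {
    StateType type;
    std::string colour;
    std::string start;
    std::string end;
};

// Assumed markup: the real formatter may emit a class or different attributes.
std::string openSpan(const SpanState& s) {
    return "<span style=\"color:" + s.colour + "\">";
}

std::size_t printStartSketch(std::stringstream& out, const SpanState& s) {
    switch (s.type) {
    case StateType::IncludeStart:      // colour covers the start sequence
        out << openSpan(s) << s.start;
        break;
    case StateType::ExcludeStart:      // colour begins after the start sequence
        out << s.start << openSpan(s);
        break;
    case StateType::IncludeStartWord:  // colour covers only the start sequence
        out << openSpan(s) << s.start << "</span>";
        break;
    case StateType::NoColour:          // no span at all
        out << s.start;
        break;
    }
    return s.start.size();
}

std::size_t printEndSketch(std::stringstream& out, const SpanState& s) {
    // includeStartWord closed its span at the start, so it needs a fresh one.
    if (s.type == StateType::IncludeStartWord) out << openSpan(s);
    out << s.end;
    if (s.type != StateType::NoColour) out << "</span>";
    return s.end.size();
}
```

For an includeStart state with colour darkGreen, start "/*" and end "*/", the two calls wrap the whole comment, delimiters included, in a single span. A real implementation would also need to HTML-escape the printed sequences, which this sketch does not do.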

The PrintRestart(std::stringstream& lineStream) method is called when a state is re-opened after being temporarily halted to display something else. It prints the opening span to lineStream unless the state is includeStartWord or NoColour.

The matching halt method, which takes the same std::stringstream& lineStream parameter, is called when a state needs to be temporarily halted to display something else. It prints a closing span to lineStream unless the state is includeStartWord or NoColour.
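Halting and restarting are mirror images: both act only on states that keep a span open across their content. A minimal sketch, with hypothetical names (printHaltSketch, printRestartSketch, holdsOpenSpan) and an assumed inline-style span markup:

```cpp
#include <sstream>
#include <string>

// Hypothetical enum mirroring the documented type options.
enum class StateType { IncludeStart, ExcludeStart, IncludeStartWord, NoColour };

// Hypothetical subset of a state needed for halting and restarting.
struct HaltableState {
    StateType type;
    std::string colour;
};

// Only includeStart and excludeStart keep a span open while the state is active.
bool holdsOpenSpan(const HaltableState& s) {
    return s.type != StateType::IncludeStartWord && s.type != StateType::NoColour;
}

// Temporarily close the state's span so something else can be displayed.
void printHaltSketch(std::stringstream& out, const HaltableState& s) {
    if (holdsOpenSpan(s)) out << "</span>";
}

// Re-open the state's span once the interruption is over.
void printRestartSketch(std::stringstream& out, const HaltableState& s) {
    if (holdsOpenSpan(s)) out << "<span style=\"color:" << s.colour << "\">";
}
```

Calling the halt function, printing the interrupting markup, and then calling the restart function keeps the generated spans properly nested instead of overlapping.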

The FindStateWord(std::string line, int pos) method loops through the list of words defined on the state. If a word in the list matches the text at the current line position and isn’t part of a larger word, the method returns that word. If the special “~word” sequence is found in the list, it uses the regular expression to find the word starting at the current position in the line and returns it.
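The word lookup can be sketched as below. Several details are assumptions rather than facts about State.cpp: the regEx attribute is treated as a pattern for a single word character (which fits the default "[a-zA-Z0-9]"), "not part of a larger word" is checked only against the character that follows the match, and an empty string signals that nothing was found. The name findStateWordSketch is hypothetical.

```cpp
#include <regex>
#include <string>
#include <vector>

std::string findStateWordSketch(const std::vector<std::string>& words,
                                const std::string& wordCharPattern,
                                const std::string& line, std::size_t pos) {
    if (pos > line.size()) return "";

    const std::regex wordChar(wordCharPattern);
    auto isWordChar = [&wordChar](char c) {
        return std::regex_match(std::string(1, c), wordChar);
    };

    for (const std::string& word : words) {
        if (word == "~word") {
            // Accept any run of word characters starting at pos.
            std::size_t endPos = pos;
            while (endPos < line.size() && isWordChar(line[endPos])) ++endPos;
            if (endPos > pos) return line.substr(pos, endPos - pos);
            continue;
        }
        if (word.empty()) continue;

        // The listed word must appear at exactly this position...
        if (line.compare(pos, word.size(), word) != 0) continue;

        // ...and must not run on into a longer word.
        const std::size_t after = pos + word.size();
        if (after < line.size() && isWordChar(line[after])) continue;

        return word;
    }
    return "";  // no valid state word at this position
}
```

For a state whose start sequence is "#" and whose word list contains "include", calling the function on the line "#include <x>" at position 1 returns "include", whereas "#includes" returns nothing because the match is part of a larger word.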